1.   Introduction

This paper presents a simple heuristic that has unified some previously
separate areas of theory and has produced some surprising computational
advantages. The underlying idea can be stated as follows. Suppose that we want
to maximize a concave function v(y) over a convex set Y. Let B denote a "box"
(i.e., hyper-cube) for which $Y \cap B$ is non-empty. Let y* be a point at which
v(y) achieves its maximum over $Y \cap B$. If y* lies in the interior of the box,
then by the concavity of v, y* must be globally optimal. If, on the other
hand, y* lies on the boundary of the box, then we can translate B to obtain
a new box B' centered at y* and try again. By "try again" we mean maximize
v over $Y \cap B'$ and check to see if the solution is in the interior of B'. This
intuitive idea is developed rigorously in section 2. Note immediately that
we presuppose some appropriate algorithm for solving each local problem. This
"appropriate algorithm" is embedded in a larger iterative process, namely
maximizing v over a finite sequence of boxes. Computational advantage can
be derived if each local problem with feasible region $Y \cap B$ is significantly
easier to solve than the global problem with feasible region Y.

The problems that we have in mind are those where v(y) is the optimal
value of a sub-problem (SPy) that is parameterized on y. Thus v(y) is not
explicitly available and evaluating it at y means solving (SPy). In this
context, the motivation for the method lies in the empirical observation
that the number of times (SPy) must be solved in order to maximize v(y) over
$Y \cap B$ can be controlled by adjusting the size of the box B. This behavior is
extremely important in decomposition methods or other problems where the
evaluation of v(y) imposes a real computational burden.

We begin by presenting BOXSTEP in a more general context so as to
facilitate its application to other kinds of problems, e.g. non-linear
programs where v(y) is explicitly available. In this case BOXSTEP bears
some resemblance to the Method of Approximation Programming (MAP) originally
proposed by Griffith and Stewart [11] and recently revived by Beale [4] and
Meyer [20]. Section 2 presents the BOXSTEP method in very general terms and
proves its convergence. Section 3 shows how an outer approximation scheme
can be used to solve each local problem. In this form BOXSTEP falls between
the feasible directions methods at one extreme and outer approximation or
cutting plane methods at the other extreme. One can obtain an algorithm of
either type, or something "in between", by simply adjusting the size of the
box. Sections 4 through 7 contain specific applications to large structured
linear programs. Section 8 points out some promising directions for additional
research.

2.   The BOXSTEP Method

BOXSTEP is not a completely specified procedure but rather a method
of replacing a single difficult problem by a finite sequence of simpler
problems. These simpler problems are to be solved by an appropriate
algorithm. This "appropriate algorithm" may be highly dependent on problem
structure but by assuming its existence and convergence we can establish
the validity of the overall strategy. In this section, therefore, we
present a general statement of the BOXSTEP method, prove its finite
$\epsilon$-optimal termination, and discuss a modification of the basic method
which will not upset the fundamental convergence property.

Consider any problem of the form

(P)       $\max_{y \in Y} v(y)$,   with $Y \subseteq R^n$ and $v: Y \to R$.

If, for $\bar{y} \in Y$ and $\beta > 0$, the local problem

$P(\bar{y}; \beta)$       $\max_{y \in Y} v(y)$   s.t.   $\|y - \bar{y}\|_\infty \le \beta$

is considerably easier to solve, either initially or in the context of a
reoptimization, then (P) is a candidate for the BOXSTEP method.

BOXSTEP Method

Step 1:   Choose $y^1 \in Y$, $\epsilon \ge 0$, $\beta > 0$. Let t = 1.

Step 2:   Using an appropriate algorithm, obtain an $\epsilon$-optimal solution
          of $P(y^t; \beta)$, the local problem at $y^t$. Let $y^{t+1}$ denote this
          solution.

Step 3:   If $v(y^{t+1}) \le v(y^t) + \epsilon$, stop. Otherwise let t = t+1 and go
          to Step 2.
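
Steps 1 through 3 can be sketched in code. The following Python is a minimal illustration, not the authors' implementation: the names `boxstep`, `solve_local`, and `max_boxes` are hypothetical, and the toy objective (a separable concave quadratic with Y taken to be all of R^n, so that each local problem is solved exactly by clamping) is assumed purely for the demonstration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def boxstep(
    v: Callable[[Sequence[float]], float],
    solve_local: Callable[[Sequence[float], float], Vector],
    y1: Sequence[float],
    beta: float,
    eps: float = 0.0,
    max_boxes: int = 1000,  # safety cap, an assumption of this sketch
) -> Tuple[Vector, int]:
    """Run Steps 1-3; return the final center y^T and the number of boxes."""
    y_t = list(y1)                        # Step 1: starting point, t = 1
    for t in range(1, max_boxes + 1):
        y_next = solve_local(y_t, beta)   # Step 2: solve the local problem at y^t
        if v(y_next) <= v(y_t) + eps:     # Step 3: no significant improvement
            return y_t, t
        y_t = y_next                      # recenter the box and repeat
    raise RuntimeError("max_boxes exceeded")


# Toy instance (assumed): v(y) = -sum((y_i - c_i)^2) with maximizer C.
C = [3.0, -2.0]


def v(y: Sequence[float]) -> float:
    return -sum((yi - ci) ** 2 for yi, ci in zip(y, C))


def solve_local(center: Sequence[float], beta: float) -> Vector:
    # v is separable and concave, so the maximizer over the box clamps C.
    return [min(max(ci, yi - beta), yi + beta) for yi, ci in zip(center, C)]


y_final, boxes = boxstep(v, solve_local, [0.0, 0.0], beta=1.0)
print(y_final, boxes)
```

In this toy run the box walks from the origin toward C one step of length $\beta$ at a time; a larger `beta` would need fewer boxes, at the price of a larger local problem.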

The BOXSTEP mnemonic comes from the fact that at each execution of
Step 2 the vector y is restricted not only to be in the set Y but also in
a box of size $2\beta$ centered at $y^t$, and the box steps toward the solution as t is
incremented. The appeal of this restriction springs both from heuristics and
from empirical observations which are discussed below. In essence, these results
indicate that $\beta = +\infty$, which corresponds to solving problem (P) all at once,
is not an optimal choice. Notice that the stopping condition at Step 3 is
based on the objective function rather than on $y^{t+1}$ being in the interior of
the box. This is necessary because v(y) may have multiple maxima, in which
case we might never obtain an interior solution.

The simplicity of the concept would indicate that the convergence
of any algorithm is not upset when embedded in the BOXSTEP method. This is
formally verified in the subsequent theorem. Assuming Y to be compact, let

    $\delta = \max_{x, y \in Y} \|x - y\|_2$      (the "diameter" of Y),

    $\lambda = \min \{\beta/\delta, 1\}$,

and

    $v^* = \max_{y \in Y} v(y)$.

Theorem:   If Y is a compact convex set and v is an upper semi-continuous
concave function on Y, then the BOXSTEP method will terminate after a finite
number of steps with a $2\epsilon/\lambda$-optimal solution.

Proof:   First we establish that the method terminates finitely. If $\epsilon > 0$, then
non-termination implies that $v(y^{t+1}) > v(y^1) + t\epsilon$ and, therefore,
$\limsup v(y^t) = \infty$. This contradicts the fact that an upper semi-continuous
function achieves its maximum on a compact set.

If $\epsilon = 0$, then either $\|y^t - y^{t+1}\|_\infty = \beta$ for each t or
$\|y^T - y^{T+1}\|_\infty < \beta$ for some T. If $\|y^T - y^{T+1}\|_\infty < \beta$ then the norm
constraints are not binding and, by concavity, may be deleted. This implies
that $v(y^{T+1}) = v^*$ and termination must occur on the next iteration. If
$\|y^t - y^{t+1}\|_\infty = \beta$ for each t, then, without termination,
$v(y^{t+1}) > v(y^t)$ for each t. If $\|y^s - y^t\|_\infty < \beta/2$ for any s > t, then this
would contradict the construction of $y^{t+1}$ as the maximum over the box
centered at $y^t$ (because $\epsilon = 0$). Therefore $\|y^s - y^t\|_\infty \ge \beta/2$ for all
s > t and for each t. This contradicts the compactness of Y. Hence the method
must terminate finitely.

When termination occurs, say at step T, we have $v(y^{T+1}) \le v(y^T) + \epsilon$.
Let $y^*$ be any point such that $v(y^*) = v^*$. Then, by concavity,

    $v((1-\lambda)y^T + \lambda y^*) \ge (1-\lambda)v(y^T) + \lambda v(y^*)$.

Now the definition of $\lambda$ implies that

    $\|(1-\lambda)y^T + \lambda y^* - y^T\|_\infty \le \beta$,

and by the construction of $y^{T+1}$ it follows that

    $v(y^{T+1}) + \epsilon \ge v((1-\lambda)y^T + \lambda y^*) \ge (1-\lambda)v(y^T) + \lambda v^*$.

Therefore, since termination occurred,

    $v(y^T) + 2\epsilon \ge (1-\lambda)v(y^T) + \lambda v^*$

or

    $\lambda v(y^T) + 2\epsilon \ge \lambda v^*$.

Hence,

    $v(y^T) \ge v^* - 2\epsilon/\lambda$.

Q.E.D.
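
The step in the proof that appeals to "the definition of $\lambda$" can be spelled out. The following is a reader's sketch using only the definitions of $\delta$ and $\lambda$ given above together with the standard inequality $\|z\|_\infty \le \|z\|_2$; the second display shows how the termination test is combined with the $\epsilon$-optimality of $y^{T+1}$.

```latex
\begin{aligned}
\|(1-\lambda)y^{T} + \lambda y^{*} - y^{T}\|_\infty
  &= \lambda\,\|y^{*} - y^{T}\|_\infty
   \le \lambda\,\|y^{*} - y^{T}\|_2
   \le \lambda\,\delta
   \le \beta , \\
v(y^{T}) + 2\epsilon
  &\ge v(y^{T+1}) + \epsilon
   \ge (1-\lambda)\,v(y^{T}) + \lambda\,v^{*} .
\end{aligned}
```

The last inequality in the first line holds because $\lambda \le \beta/\delta$ whenever $\delta > 0$; the first inequality in the second line is the stopping condition of Step 3.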


If v(y) is piecewise linear, then we may take $\epsilon = 0$. In general,
however, the requirement that we obtain an $\epsilon$-optimal solution of each
local problem means that $\epsilon$ must be strictly positive.

As we shall see below, a number of alternatives are available for
determining the next position of the box after each local optimization.
Moving the center of the box from $y^t$ to $y^{t+1}$ is the simplest possible
tactic. An important alternative would be to perform a line search in the
direction $d^t = y^{t+1} - y^t$. At Step 3 we would replace $y^{t+1}$ by
$y^t + \theta^* d^t$ where $\theta^*$ is optimal for

(LS)      $\max_{\theta \ge 1} v(y^t + \theta d^t)$   s.t.   $y^t + \theta d^t \in Y$.

This modification of the basic method does not require any change in the
statement of the theorem. Far from the global optimum, therefore, BOXSTEP
could be viewed as a feasible directions method which uses more than strictly
local information to determine the next direction of search. Once the global
optimum is contained in the current box, however, BOXSTEP is simply a restricted
version of the algorithm chosen to execute Step 2.
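
The line search (LS) can be illustrated with a short sketch. This is not the method used in the paper's experiments (section 4 cites an adaptation of Fisher and Shapiro's method); it is a generic ternary search that relies only on concavity of v along the ray and convexity of Y. The names `line_search`, `in_Y`, and `theta_max`, and the toy problem, are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple


def line_search(
    v: Callable[[Sequence[float]], float],
    in_Y: Callable[[Sequence[float]], bool],
    y_t: Sequence[float],
    y_next: Sequence[float],
    theta_max: float = 64.0,   # assumed cap on the step multiple
    iters: int = 60,
) -> Tuple[float, List[float]]:
    """Approximately maximize v(y_t + theta*d) over feasible theta >= 1."""
    d = [b - a for a, b in zip(y_t, y_next)]

    def point(theta: float) -> List[float]:
        return [a + theta * di for a, di in zip(y_t, d)]

    # Y is convex and theta = 1 is feasible, so the feasible thetas form an
    # interval [1, hi]; locate hi by bisection when theta_max is infeasible.
    lo, hi = 1.0, theta_max
    if not in_Y(point(hi)):
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if in_Y(point(mid)):
                lo = mid
            else:
                hi = mid
        hi = lo

    # Ternary search: valid because v is concave along the ray.
    lo = 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if v(point(m1)) < v(point(m2)):
            lo = m1
        else:
            hi = m2
    theta = 0.5 * (lo + hi)
    return theta, point(theta)


# Toy data (assumed): concave quadratic, Y = {y : y[0] <= 10}.
def v(y: Sequence[float]) -> float:
    return -((y[0] - 3.0) ** 2 + (y[1] + 2.0) ** 2)


theta_star, y_new = line_search(v, lambda y: y[0] <= 10.0, [0.0, 0.0], [1.0, -1.0])
print(theta_star, y_new)
```

For a piecewise linear v a breakpoint-based search would be exact; the ternary search is used here only because it needs nothing beyond function values.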

3.   Implementation:   solving the local problem by outer approximation.

We now specify that for the remainder of this paper Step 2 of the
BOXSTEP method is to be executed with an outer approximation (cutting
plane) algorithm. Thus, in the framework of [7], each local problem
will be solved by Outer Linearization/Relaxation.

By convexity, both v and Y can be represented in terms of the family
of their linear supports. Thus

(3.1)     $v(y) = \min_{k \in K} (f^k + g^k y)$

(3.2)     $Y = \{y \in R^n \mid p^j + q^j y \ge 0 \text{ for } j \in J\}$

where J and K are index sets, $p^j$ and $f^k$ are scalars, and $q^j, g^k \in R^n$.
These are such that

a)  for each $y \in Y$ there is a $k \in K$ with $v(y) = f^k + g^k y$; and

b)  for each y on the boundary of Y there is a $j \in J$ with $p^j + q^j y = 0$.

In the applications presented in sections 4 through 7 the function v(y)
represents the optimal value of a subproblem that is parameterized on y.
The set Y contains all points y for which (SPy) is of interest. When
$\hat{y} \in Y$ the algorithm for (SPy) produces a linear support for v at $\hat{y}$, but
if $\hat{y} \notin Y$ it produces a constraint that is violated at $\hat{y}$. Thus in the former
case we get $f^{k*}$ and $g^{k*}$ such that

(3.3)     $v(\hat{y}) = f^{k*} + g^{k*} \hat{y}$

while in the latter case we get $p^{j*}$ and $q^{j*}$ such that

(3.4)     $p^{j*} + q^{j*} \hat{y} < 0$.

Given the representations in (3.1) and (3.2), the local problem at
any $y^t \in Y$ can be written as

$P(y^t; \beta)$       max $\sigma$

subject to

    $f^k + g^k y \ge \sigma$      for $k \in K$
    $p^j + q^j y \ge 0$      for $j \in J$
    $y_i^t - \beta \le y_i \le y_i^t + \beta$      for i = 1,...,n.

Let $\bar{P}(y^t; \beta)$ denote $P(y^t; \beta)$ with J replaced by a finite $\bar{J} \subseteq J$ and K
replaced by a finite $\bar{K} \subseteq K$. Thus $\bar{P}(y^t; \beta)$ is a linear program and a
relaxation of $P(y^t; \beta)$. The outer approximation algorithm for Step 2 of
the BOXSTEP method can then be stated as follows.

Step 2a:   Choose an initial $\bar{J}$ and $\bar{K}$.

Step 2b:   Solve the linear program $\bar{P}(y^t; \beta)$ to obtain an optimal solution
           $(\hat{y}, \hat{\sigma})$.

Step 2c:   If $\hat{y} \in Y$, continue to 2d. Otherwise determine $p^{j*}$ and $q^{j*}$ such that
           $p^{j*} + q^{j*} \hat{y} < 0$.
           Set $\bar{J} = \bar{J} \cup \{j*\}$ and go to 2b.

Step 2d:   Determine $f^{k*}$ and $g^{k*}$ such that
           $v(\hat{y}) = f^{k*} + g^{k*} \hat{y}$.
           If $v(\hat{y}) < \hat{\sigma} - \epsilon$, set $\bar{K} = \bar{K} \cup \{k*\}$ and go to 2b.

Step 2e:   Done; $(\hat{y}, \hat{\sigma})$ is $\epsilon$-optimal for $P(y^t; \beta)$.

The convergence properties of this procedure are discussed, for example, in
Zangwill [22] or Luenberger [17].
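
Steps 2a through 2e can be made concrete in one dimension, where the relaxed linear program is small enough to solve by enumerating breakpoints. The sketch below is illustrative only: it assumes Y is the whole real line (so Step 2c never fires), it uses a hypothetical `oracle` that returns v and a supporting slope at a point, and the helper names are not from the paper. A real implementation would call an LP solver at Step 2b.

```python
from typing import Callable, List, Tuple

Cut = Tuple[float, float]  # (f, g): the support sigma <= f + g*y


def solve_relaxed_lp(cuts: List[Cut], lo: float, hi: float) -> Tuple[float, float]:
    """max sigma s.t. sigma <= f + g*y for every cut, lo <= y <= hi (1-D)."""
    candidates = [lo, hi]
    for i, (f1, g1) in enumerate(cuts):
        for f2, g2 in cuts[i + 1:]:
            if g1 != g2:
                y = (f2 - f1) / (g1 - g2)   # where the two cuts intersect
                if lo <= y <= hi:
                    candidates.append(y)

    def model(y: float) -> float:
        return min(f + g * y for f, g in cuts)

    y_hat = max(candidates, key=model)      # optimum is at an endpoint or breakpoint
    return y_hat, model(y_hat)


def local_outer_approximation(
    oracle: Callable[[float], Tuple[float, float]],
    center: float,
    beta: float,
    eps: float = 1e-6,
    max_cuts: int = 200,                    # safety cap, an assumption
) -> Tuple[float, float, int]:
    lo, hi = center - beta, center + beta
    value, slope = oracle(center)
    cuts = [(value - slope * center, slope)]                 # Step 2a
    for _ in range(max_cuts):
        y_hat, sigma_hat = solve_relaxed_lp(cuts, lo, hi)    # Step 2b
        value, slope = oracle(y_hat)                         # Step 2d: new support
        if value >= sigma_hat - eps:                         # Step 2e: eps-optimal
            return y_hat, value, len(cuts)
        cuts.append((value - slope * y_hat, slope))
    raise RuntimeError("max_cuts exceeded")


# Toy concave function (assumed): v(y) = -(y - 3)^2 with derivative -2(y - 3).
def oracle(y: float) -> Tuple[float, float]:
    return -(y - 3.0) ** 2, -2.0 * (y - 3.0)


y_hat, v_hat, n_cuts = local_outer_approximation(oracle, center=0.0, beta=1.0)
print(y_hat, v_hat, n_cuts)
```

With the box [-1, 1] the local optimum sits on the boundary of the box, which is exactly the situation in which BOXSTEP would recenter and continue.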

It is not necessary to solve every $P(y^t; \beta)$ to completion (i.e.,
$\epsilon$-optimality). If a tolerance $\Delta > 0$ is given and if at Step 2d we find that

(3.5)     $v(\hat{y}) > v(y^t) + \Delta$

then we could immediately move the center of the box to $\hat{y}$ and set $y^{t+1} = \hat{y}$.
This is one of many heuristics that would eliminate unnecessary computation
far from the global maximum.

Step 2b requires the solution of a linear program. The role of
reoptimization within a single execution of Step 2 is obvious. However, between
successive executions of Step 2, i.e. between successive local problems, there
is an opportunity for reoptimization arising from the fact that the constraints
indexed by J and K are valid globally. Thus at Step 2a we may choose the
initial $\bar{J}$ and $\bar{K}$ to include any constraints generated in earlier boxes and
accumulate a description of v and Y.

In the case where v(y) is the optimal value of (SPy), the power of the
BOXSTEP method rests upon two observations. First, the number of times that
(SPy) must be solved in order to solve a local problem can be controlled by
adjusting the size of the box. For a larger box, more solutions of (SPy)
(i.e. linear supports of v and Y) are required. This empirical finding is
documented in the computational results which follow. Second, reoptimization
of (SPy) is greatly facilitated because the successive points y for which the
solution of (SPy) is needed are prevented, by the box, from being very far
apart. For example, when (SPy) is a network problem whose arc costs depend
on y, reoptimization is very fast for small changes in y even though it may
equal the solution-from-scratch time for large changes in y.

Ascent and feasible directions methods typically exploit prior information
in the form of a good initial estimate but the information generated during the
solution procedure is not cumulative. Outer approximation methods, in
contrast, do not exploit prior information but the cuts generated during solution
are cumulative. The movement of the box and the accumulation of cuts place the
BOXSTEP method in the conceptual continuum between these two extremes and capture
features of both.

The desire to use the framework of column generation for the
reoptimization techniques at Step 2b dictates that we work on the dual of
$\bar{P}(y^t; \beta)$. We record this dual here for future reference.

(3.6)     $\min \sum_{k \in \bar{K}} f^k \lambda_k + \sum_{j \in \bar{J}} p^j \mu_j - \sum_{i=1}^{n} (y_i^t - \beta) \delta_i^+ + \sum_{i=1}^{n} (y_i^t + \beta) \delta_i^-$

          s.t.   $\sum_{k \in \bar{K}} \lambda_k = 1$

                 $\sum_{k \in \bar{K}} (-g^k) \lambda_k + \sum_{j \in \bar{J}} (-q^j) \mu_j - I\delta^+ + I\delta^- = 0$

                 $\lambda, \mu, \delta^+, \delta^- \ge 0$.

The similarity of (3.6) to a Dantzig-Wolfe master problem will be commented
upon in the next section.

All of the computational results presented in the subsequent sections
were obtained by implementing the BOXSTEP method, as described above, within
the SEXOP linear programming system [19].

4.  Application:  Dantzig-Wolfe decomposition

Consider the linear program

(DW)      $\min_{x \in X} cx$   s.t.   $Ax \le b$

where X is a non-empty polytope and A is an (m x n) matrix. This is the
problem addressed by the classical Dantzig-Wolfe decomposition method [5].
It is assumed that the constraints determined by A are coupling or
"complicating" constraints in the sense that it is much easier to solve the
Lagrangean subproblem

(SPy)     $\min_{x \in X} cx + y(Ax - b)$

for a given value of y than to solve (DW) itself. If we let v(y) denote
the minimal value of (SPy), then the dual of (DW) with respect to the
coupling constraints can be written as

(4.1)     $\max_{y \ge 0} v(y)$.

Let $\{x^k \mid k \in K\}$ be the extreme points and $\{z^j \mid j \in J\}$ be the extreme rays
of X. Then $v(y) > -\infty$ if and only if $y \in Y'$ where

(4.2)     $Y' = \{y \in R^m \mid cz^j + yAz^j \ge 0 \text{ for } j \in J\}$

and when $y \in Y'$ we have

(4.3)     $v(y) = \min_{k \in K} cx^k + y(Ax^k - b)$.

The set Y of interest is the intersection of Y' with the non-negative
orthant. Thus Y and v(y) are of the form discussed in section 3. (Note
that y is a row vector.) In this context the local problem is to maximize
the Lagrangean function v(y) over a box centered at $y^t$. This can be expressed
as:

(4.4)     max $\sigma - yb$

          s.t.

          $\sigma - y(Ax^k) \le cx^k$      for $k \in K$

          $-y(Az^j) \le cz^j$      for $j \in J$

          $\max\{0, y_i^t - \beta\} \le y_i \le y_i^t + \beta$      for i = 1,...,m

and the problem solved at Step 2b (see (3.6)) is

(4.5)     $\min \sum_{k \in \bar{K}} (cx^k) \lambda_k + \sum_{j \in \bar{J}} (cz^j) \mu_j - \sum_{i=1}^{m} \max\{0, y_i^t - \beta\} \delta_i^+ + \sum_{i=1}^{m} (y_i^t + \beta) \delta_i^-$

          s.t.

          $\sum_{k \in \bar{K}} \lambda_k = 1$

          $\sum_{k \in \bar{K}} (-Ax^k) \lambda_k + \sum_{j \in \bar{J}} (-Az^j) \mu_j - I\delta^+ + I\delta^- = -b$

          $\lambda, \mu, \delta^+, \delta^- \ge 0$.

If the point $y^t$ is in fact the origin ($y^t = 0$) then the objective
function of (4.5) becomes

(4.6)     $\sum_{k \in \bar{K}} (cx^k) \lambda_k + \sum_{j \in \bar{J}} (cz^j) \mu_j + \sum_{i=1}^{m} \beta \delta_i^-$

and hence, if $\beta$ is very large, (4.5) becomes exactly the Dantzig-Wolfe master
problem. There is a slack variable $\delta_i^+$ and a surplus variable $\delta_i^-$ for each row.
Since the constraints were $Ax \le b$ each slack variable has zero cost while each
surplus variable is assigned the positive cost $\beta$. The cost $\beta$ must be large
enough to drive all of the surplus variables out of the solution. In terms
of the BOXSTEP method, then, Dantzig-Wolfe decomposition is simply the
solution of one local problem over a sufficiently large box centered at the
origin. The appearance of a surplus variable in the final solution would
indicate that the cost $\beta$ was not large enough, i.e. that the box was not
large enough.
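
To make the representation (4.3) concrete, the sketch below evaluates the Lagrangean function by brute-force enumeration of extreme points and returns the linear support that a cutting-plane method would add. It is an illustration under strong assumptions: X is given explicitly as a short list of extreme points (in practice (SPy) is solved by a specialized algorithm), X has no extreme rays, and the function name `lagrangean` and the tiny data set are hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], w: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, w))


def lagrangean(
    y: Sequence[float],
    c: Sequence[float],
    A: Sequence[Sequence[float]],
    b: Sequence[float],
    extreme_points: Sequence[Sequence[float]],
) -> Tuple[float, float, List[float]]:
    """Return (v(y), f, g) where v(y) = f + g.y is a linear support, as in (4.3)."""
    best = None
    for x in extreme_points:
        f = dot(c, x)                                          # c x^k
        g = [dot(row, x) - b_i for row, b_i in zip(A, b)]      # A x^k - b
        value = f + dot(y, g)
        if best is None or value < best[0]:
            best = (value, f, g)
    return best


# Toy data (assumed): X is the unit square, one coupling row x1 + x2 <= 1.
c = [-1.0, -1.0]
A = [[1.0, 1.0]]
b = [1.0]
X_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

at_zero = lagrangean([0.0], c, A, b, X_points)
at_one = lagrangean([1.0], c, A, b, X_points)
print(at_zero, at_one)
```

In this toy problem the dual function peaks at y = 1, where its value matches the optimal value of the small linear program, as linear programming duality requires.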

The test problem that we have used is a linear program of the form
(DW) which represents a network design model. The matrix A has one row for
each arc in the network. The set X has no extreme rays, hence Y is just the
non-negative orthant. The subproblem (SPy) separates into two parts. The
first part involves finding all shortest routes through the network. The
second part can be reduced to a continuous knapsack problem. Unfortunately,
however, there was no convenient way to reoptimize (SPy). For a network with
M nodes and L arcs problem (DW), written as a single linear program, has
M(M-1) + 3L + 1 constraints and 2LM + 3L variables. The details of this
model are given by Agarwal [1].

For this type of problem the best performance was obtained by solving
each local problem from scratch. Thus constraints from previous local
problems were not saved. A line search, as indicated in section 2, was performed
between successive boxes. This was done with an adaptation of Fisher and
Shapiro's efficient method for concave piecewise linear functions [6].

Table 1 summarizes our results for a test problem with M = 12 nodes and
L = 18 arcs. The problem was run with several different box sizes. Each run
started at the same point $y^1$, a heuristically determined solution arising
from the interpretation of the problem. For each box size $\beta$ the column headed
$N(\beta)$ gives the average number of times (SPy) was solved per box. Notice that
this number increases monotonically as the box size increases. For a fixed
box size, the number of subproblem solutions per box did not appear to increase
systematically as we approached the global optimum. The column headed T gives

Table 1.   Solution of network design test problem by BOXSTEP (Dantzig-Wolfe)
with varying box sizes.

$\beta$ (box size)    no. of boxes required    $N(\beta)$    T (seconds)

     0.1                    34                12.7           172
     0.5                    18                14.2           118
     1.0                    13                17.1           104
     2.0                     9                17.7            88
     3.0                     6                25.0            99
     4.0                     4                26.8            76
     5.0                     5                33.4           134
     6.0                     4                34.3           115
     7.0                     3                38.0           119
    20.0                     2                67.5           203
    25.0                     2                74.0           243
    30.0                     1                74.0           128
  1000.0                     1                97.0           217

the total computation time, in seconds, for a CDC6400. All runs were made with

£=10"^ 

The largest box ($\beta = 1000$) represents the Dantzig-Wolfe end of the
scale. The smallest box ($\beta = 0.1$) produces an ascent that approximates
a steepest ascent. A pure steepest ascent algorithm, as proposed by Grinold
[12], was tried on this problem. With Grinold's primal/dual step-size rule
the steps became very short very quickly. By taking optimal size steps
instead, we were able to climb higher but appeared to be converging to the
value 5097 although the maximum was at 5665. The poor performance of
steepest ascent on this problem is consistent with our poor results with
the smallest box.
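
The trade-off visible in Table 1 can be tabulated directly. The short script below only re-uses the figures reported in Table 1; the derived column (boxes times $N(\beta)$, i.e. the approximate total number of subproblem solutions) is our own arithmetic and does not appear in the paper.

```python
# Rows of Table 1: (beta, number of boxes, N(beta), T in CDC6400 seconds).
TABLE_1 = [
    (0.1, 34, 12.7, 172), (0.5, 18, 14.2, 118), (1.0, 13, 17.1, 104),
    (2.0, 9, 17.7, 88), (3.0, 6, 25.0, 99), (4.0, 4, 26.8, 76),
    (5.0, 5, 33.4, 134), (6.0, 4, 34.3, 115), (7.0, 3, 38.0, 119),
    (20.0, 2, 67.5, 203), (25.0, 2, 74.0, 243), (30.0, 1, 74.0, 128),
    (1000.0, 1, 97.0, 217),
]

# Since N(beta) is an average per box, boxes * N(beta) approximates the
# total number of times (SPy) was solved in the whole run.
for beta, boxes, n_avg, seconds in TABLE_1:
    print(f"beta={beta:7.1f}  total solves ~ {boxes * n_avg:6.1f}  T={seconds}")

fastest = min(TABLE_1, key=lambda row: row[3])
print("fastest run:", fastest)
```

The fastest run is at an intermediate box size, neither the steepest-ascent end nor the Dantzig-Wolfe end of the scale.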

Some  additional  computational  results  with  Dantzig-Wolfe  decomposition 
are  reported  in  sections  6  and  7. 

5.   Application:   Benders decomposition

The dual or Lagrangean orientation of the previous section is complemented
by the primal formulation discussed in this section. Here the function v(y) is
obtained from the application of Benders decomposition to a large structured
linear program.

A large-scale contract selection and distribution problem is formulated
as a mixed integer linear programming problem by Austin and Hogan [2]. The
integer portion of the problem models a binary decision regarding the inclusion
or exclusion of certain arcs in a network. Embedded in a branch-and-bound
scheme, the bounding problem is always a network problem with resource
constraints which has the following form.


(5.1)     $\min_x \sum_{k \in K} c_k x_k$

subject to

(5.2)     $\sum_{k \in B_i} x_k - \sum_{k \in A_i} x_k = 0$      $i \in N$

(5.3)     $l_k \le x_k \le h_k$      $k \in K$

(5.4)     $\sum_{k \in R} a_{jk} x_k \le r_j$      j = 1,...,p

where

    N           the set of nodes
    K           the set of arcs
    $x_k$       the flow on arc k
    $c_k$       the unit cost of flow on arc k
    $h_k$       the upper bound for flow on arc k
    $l_k$       the lower bound for flow on arc k
    $B_i$       the arcs which end at node i
    $A_i$       the arcs which originate at node i
    p           the number of resource constraints
    R           the arcs which are subject to the resource constraints
    $r_j$       the amount of resource j available
    $a_{jk}$    the coefficient of arc k in resource constraint j.

Hogan [16] has applied Benders decomposition to this problem, resulting in a
subproblem of the form

(SPy)     $\min_x \sum_{k \in K} c_k x_k$

subject to

          $\sum_{k \in B_i} x_k - \sum_{k \in A_i} x_k = 0$      $i \in N$

\  -  \  -  \  ^^^"^ 

where $y_k$ represents an allocation of resources to arc k. For any vector y,
v(y) is the minimal value of (SPy). Note that there is one variable $y_k$ for each
arc that is resource-constrained. Let

(5.5)     $Y^1 = \{y \mid \text{(SPy) is feasible}\}$

and

(5.6)     $Y^2 = \{y \mid \sum_{k \in R} a_{jk} y_k \le r_j, \; j = 1, \ldots, p\}$.

Thus $Y^2$ is the feasible region (in y-space) determined by the resource constraints.
The set Y of interest is then $Y^1 \cap Y^2$, and (5.1)-(5.4) is equivalent to (5.7).

(5.7)     $\min_{y \in Y} v(y)$

The piecewise linear convex function v can be evaluated at any point y by
solving a single commodity network problem. As a by-product we obtain a linear
support for v at y. Similarly, the implicit constraints in $Y^1$ can be represented
as a finite collection of linear inequalities, one of which is easily generated
whenever (SPy) is not feasible. Thus v(y) lends itself naturally to an outer
approximation solution strategy. The details are given by Hogan [16].

The generation of linear supports or cuts requires the reoptimization of
the subproblem. Since this was relatively expensive computationally, some of
the cuts were retained as the box moved. This greatly reduced the number of
cuts that had to be generated to solve each local problem after the first move
of the box. In contrast to the application presented in section 4, saving cuts
paid off but the line search did not. The rule for saving cuts was quite simple.
Up to a fixed number were accumulated. Once that number was reached each new
cut replaced an (arbitrary) old cut that was currently slack.
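
The cut-saving rule just described can be sketched as a small bounded pool. This is an illustration, not the authors' code: the class name `CutPool`, the slackness tolerance, the choice of the first slack cut as the "arbitrary" one, and the fallback of exceeding the capacity when no stored cut is slack are all assumptions, since the text does not specify them. Cuts are stored in the minimization form used in this section, $\sigma \ge f + g y$.

```python
from typing import List, Sequence, Tuple

Cut = Tuple[float, List[float]]  # (f, g): the constraint sigma >= f + g.y


def slack(cut: Cut, y: Sequence[float], sigma: float) -> float:
    """Amount by which the cut is satisfied at the current LP solution."""
    f, g = cut
    return sigma - (f + sum(gi * yi for gi, yi in zip(g, y)))


class CutPool:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.cuts: List[Cut] = []

    def add(self, new_cut: Cut, y: Sequence[float], sigma: float,
            tol: float = 1e-9) -> None:
        # Below capacity: simply accumulate.
        if len(self.cuts) < self.capacity:
            self.cuts.append(new_cut)
            return
        # At capacity: overwrite some cut that is slack at (y, sigma).
        for i, old in enumerate(self.cuts):
            if slack(old, y, sigma) > tol:
                self.cuts[i] = new_cut
                return
        # Assumption of this sketch: if every stored cut is binding,
        # keep the new cut anyway rather than discard it.
        self.cuts.append(new_cut)


pool = CutPool(capacity=2)
pool.add((0.0, [1.0]), y=[0.0], sigma=0.0)
pool.add((0.0, [-1.0]), y=[0.0], sigma=0.0)
pool.add((1.0, [0.0]), y=[2.0], sigma=2.0)   # replaces the slack second cut
print(pool.cuts)
```

Section 6 reports a run in which at most 20 cuts were retained; `capacity` plays that role here.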

Twenty-five test problems of the type found in [2] were randomly
generated and solved. The networks had approximately 650 arcs of which four
important arcs were constrained by two resource constraints. In each case the
BOXSTEP method was started at a randomly generated initial value of y. The
mean B6700 seconds to solution are recorded in Table 2 under the column headed
$T_1$. Test runs were made for several box sizes. The results indicate the
superiority of a "moderate" sized box, but the computational advantage gained is
not as marked as with the real-life problem reported below. All of these runs
were made with $\epsilon = 10^{-3}$.

Although the effort required to solve the problem within the box decreases
as the box shrinks, the results indicate a trade-off between
the size of the box and the number of moves required to solve the overall problem
(5.7). There is a notable exception to this rule however. Frequently, if not
always, a real problem provides a readily available prior estimate of an optimal
solution point y*. Most large-scale problems have a natural physical or economic
interpretation which will yield a reasonable estimate of y*. In the present
application, recall that (5.7) is actually the bounding problem in a
branch-and-bound scheme. The v function changes slightly as we move from one branch to
another but the minimizing point y* changes little if at all. Thus we wish
to solve a sequence of highly related problems of the form (5.7). Using the
previous y* as the starting point on a new branch would seem quite reasonable.
Furthermore, the size of the box used should be inversely related to our
confidence in this estimate. To illustrate this important point, the twenty-five
test problems were restarted with the box centered at the optimal solution.
The mean time required, in B6700 seconds, to solve the problem over this box
is recorded in Table 2 under the column headed $T_2$. These data and experience
with this class of problems in the branch-and-bound procedure indicate that
starting with a good estimate of the solution and a small box size reduces the
time required to solve (5.7) by an order of magnitude as compared to the case
with $\beta = +\infty$. Clearly a major advantage of the BOXSTEP method is the ability to
capitalize on prior information.

A real contract selection and distribution problem was solved for the
Defense Supply Agency. This problem had 709 nodes, 4837 arcs, and 6 resource
constraints involving 48 of the arcs. There were no integer variables,
however, so only one problem of the form (5.7) had to be solved. Two innovations
were made in applying the BOXSTEP method to this problem. The first innovation
was to use a two-pass solution strategy. The problem was first solved with a
relatively large tolerance $\epsilon_1$. Then, starting at the first pass solution, the
box size was reduced and the problem re-solved with a finer tolerance $\epsilon_2$. This
was quite effective, as can be seen in Table 3 where the total solution time
in B6700 seconds is given in the column headed T.

The second innovation was to use a non-cubical box. A different size $\beta_k$
can be selected for each dimension of the box so that we have

(5.8)     $y_k^t - \beta_k \le y_k \le y_k^t + \beta_k$      for $k \in R$

defining a hyper-rectangle centered at $y^t$. In this particular problem each
resource-constrained arc appears in only one of the constraints (5.4). For
each arc k appearing in constraint j ($a_{jk} \ne 0$), we can take $\beta_k$ as some fraction
of the resource availability $r_j$. This was done and the results are reported
in Table 4. The best time for this problem was the 1200 seconds obtained by
using r/20 to set the initial box dimensions.
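
The rule for sizing the rectangular box can be written out in a few lines. The sketch below assumes, as in the problem described above, that each resource-constrained arc appears in exactly one resource constraint; the function names, the dictionary layout, and the sample numbers are hypothetical and are not taken from the DSA data.

```python
from typing import Dict, Hashable, List, Tuple


def rectangular_box_sizes(
    arcs_in_constraint: Dict[Hashable, List[Hashable]],
    availability: Dict[Hashable, float],
    divisor: float,
) -> Dict[Hashable, float]:
    """beta_k = r_j / divisor for every arc k with a nonzero coefficient in row j."""
    beta = {}
    for j, arcs in arcs_in_constraint.items():
        for k in arcs:
            beta[k] = availability[j] / divisor
    return beta


def box_bounds(
    center: Dict[Hashable, float], beta: Dict[Hashable, float]
) -> Dict[Hashable, Tuple[float, float]]:
    """Lower and upper limits of the hyper-rectangle centered at `center`."""
    return {k: (center[k] - beta[k], center[k] + beta[k]) for k in beta}


# Hypothetical data: two resources, three resource-constrained arcs.
arcs_in_constraint = {1: ["a", "b"], 2: ["c"]}
availability = {1: 200.0, 2: 50.0}

beta = rectangular_box_sizes(arcs_in_constraint, availability, divisor=20.0)
bounds = box_bounds({"a": 40.0, "b": 0.0, "c": 5.0}, beta)
print(beta, bounds)
```

Scaling each side of the box to the resource it draws on keeps the box comparably "tight" in every coordinate even when the resources differ greatly in magnitude.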

Table 2.   Solution of 25 resource constrained network problems by
BOXSTEP (Benders) with varying box sizes.

$\beta$ (box size)      $T_1$ (seconds)      $T_2$ (seconds)

10^                  31.8                 4.1
10^                  23.1                 5.8
10^                  21.2                14.9
10^                  34.6                25.0

$T_1$ :  Mean time to solution using a randomly generated vector as the
initial y.

$T_2$ :  Mean time to solution using an optimal solution as the initial y.

Table 3.   Solution of DSA problem by BOXSTEP (Benders) with 2-pass strategy.

$\epsilon_1$        $\beta_1$        $\epsilon_2$        $\beta_2$        T (seconds)

10-^        10^         10-^        10'^/2       1450
10-3        10^         10-^        10^          1470
10-3        10^         10-^        10^          1700
10-3        10^°                                 >1800
10-^        10^                                  >1800

No significance should be attached to the large absolute magnitude of $\beta$. The
optimal solution is of the same order of magnitude, 10'.

Table 4.   Solution of DSA problem by BOXSTEP (Benders) with 2-pass strategy and
rectangular box.

$\epsilon_1$        $\beta_1$        $\epsilon_2$        $\beta_2$        T (seconds)

10-3        r/5         10-^        $\beta_1$/5       2100
10-3        r/10        10-^        $\beta_1$/5       1400
10-3        r/20        10-^        $\beta_1$/5       1200
10-3        r/50        10-5        $\beta_1$/5       1300
6.   Application:   Dantzig-Wolfe  revisited

Some of the resource-constrained network problems introduced in section 5
have also been solved by Dantzig-Wolfe decomposition as developed in section 4.
These results will be presented very briefly.  If problem (5.1)-(5.4) is
dualized with respect to the resource constraints (5.4), the resulting
subproblem is a network problem parameterized in its arc costs.

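(SPy)   $$\min_{x} \; \sum_{k \in K-R} c_k x_k + \sum_{k \in R} \Bigl[ c_k + \sum_{j} y_j a_{jk} \Bigr] x_k - \sum_{j} y_j r_j$$

subject to

        $$\sum_{k \in B_i} x_k - \sum_{k \in A_i} x_k = 0, \qquad i \in N$$

        $$\ell_k \le x_k \le h_k, \qquad k \in K$$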

where y is a vector of resource prices.  Note that there is one variable $y_j$ for
each  resource  constraint. 
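
As a minimal sketch of how the prices enter the subproblem, the following computes the priced arc costs $c_k + \sum_j y_j a_{jk}$ and the constant term $-\sum_j y_j r_j$ of (SPy). The function name, the dictionary layout, and the small data set are hypothetical and chosen only for illustration; the network solver itself is not shown.

```python
def priced_arc_costs(c, a, y, r):
    """Return (costs, constant) for the Lagrangean network subproblem.

    c: dict arc -> original cost c_k
    a: dict (j, arc) -> coefficient a_jk (only resource-constrained arcs appear)
    y: list of resource prices y_j
    r: list of resource availabilities r_j
    """
    costs = dict(c)                      # unconstrained arcs keep their cost
    for (j, k), coeff in a.items():      # constrained arcs are charged y_j * a_jk
        costs[k] += y[j] * coeff
    constant = -sum(yj * rj for yj, rj in zip(y, r))
    return costs, constant


# Hypothetical data: three arcs, two resource constraints.
c = {"k1": 4.0, "k2": 1.0, "k3": 2.5}
a = {(0, "k2"): 2.0, (1, "k3"): 1.0}
y = [0.5, 3.0]
r = [10.0, 4.0]
costs, constant = priced_arc_costs(c, a, y, r)
```

Evaluating v(y) then amounts to solving an ordinary network problem with `costs` and adding `constant` to its optimal value.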

Tests were made with a network of 142 nodes and 1551 arcs.  Solving this
network without resource constraints ((5.1)-(5.3)) took 5 seconds on a CDC6400.
The resulting solution had positive flow on only 100 of the arcs.  These
100 arcs were divided into 5 groups of 20 each, and 5 resource constraints were
devised so as to make that solution infeasible.  The resource-constrained
problem was then solved by the BOXSTEP method for several different box sizes.
As in section 5, the best performance was obtained by saving old cuts and
omitting the line search.

Table 5 contains the results for a series of runs, each starting at the origin.
The maximum number of cuts retained was 20 and $\epsilon = 10^{-6}$ was used for
each run.  The time is given in CDC6400 seconds.  (For this type of application
the CDC6400 is between 5 and 6 times faster than the B6700.)

In addition, the 25 test problems reported in section 5 were solved using
the Dantzig-Wolfe approach.  This was done on the B6700 and the counterpart of
Table 2 is reported in Table 6.
Table 5.  Solution of resource-constrained network problem by BOXSTEP
(Dantzig-Wolfe) with varying box sizes.

BOX SIZE ($\beta$)    NUMBER OF BOXES REQUIRED    T (seconds)

0.1                   ?                           >150
0.5                   7                           70
1.0                   4                           55
2.0                   2                           76
3.0                   2                           119
4.0                   1                           92
5.0                   1                           108
1000.0                1                           >150
Table 6.  Solution of 25 resource-constrained network problems by
BOXSTEP (Dantzig-Wolfe) with varying box sizes.

$\beta$ (box size)    $T_1$ (seconds)    $T_2$ (seconds)

10^0                  67.8               3.7
10^                   35.1               4.8
10^                   17.0               9.9
10^                   22.4               14.9
10^                   27.1               17.0

$T_1$:  Mean time to solution using a randomly generated vector as the
        initial y.
$T_2$:  Mean time to solution using an optimal solution as the initial y.


7.   Application:   Subgradient  optimization 

A very promising new approach to the class of problems we have been
considering was initiated by Held and Karp [13] in their work on the traveling
salesman problem.  This approach, called subgradient optimization, has been
further developed by Held and Wolfe [14] and Held, Wolfe, and Crowder [15].
The problem addressed is

(7.1)   $$\max_{y \in Y} \; v(y)$$

where Y is convex and v(y) is defined as in (3.1) as the pointwise minimum of a
family of linear functions:

(7.2)   $$v(y) = \min_{k \in K} \, \bigl( f^{k} + g^{k} y \bigr).$$

If $v(y^{t}) = f^{k(t)} + g^{k(t)} y^{t}$, then it is well known that $g^{k(t)}$ is a
subgradient of v at $y^{t}$.  Subgradient optimization consists of the following
iterative process, starting at any initial point:

(7.3)   $$y^{t+1} = P\bigl( y^{t} + s_t \, g^{k(t)} \bigr)$$

where P is the operator projecting onto Y, $g^{k(t)}$ is a subgradient of v at
$y^{t}$, and $\{s_t\}$ is a sequence for which $s_t \to 0$ but $\sum_t s_t = \infty$.
The convergence of this process is discussed in [14].  We note here only that
the sequence $v(y^{t})$ is not monotonic.  In practice it has been found that
(7.3) can produce a near optimal solution of (7.1) very quickly.  Achieving
complete optimality, which is important if (7.1) is the dual of the problem of
interest, is often considerably more difficult.
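
The iteration (7.3) can be sketched as follows. This is an illustrative sketch only, under stated assumptions: Y is taken to be a box so that the projection is a coordinate-wise clamp, the step sizes are 1/t, and the function names and the three-piece test function are hypothetical. Because the values $v(y^{t})$ are not monotonic, the sketch keeps track of the best point seen.

```python
def v_and_subgradient(y, pieces):
    """Evaluate v(y) = min_k (f_k + g_k . y); return the value and a minimizing g_k."""
    best_val, best_g = None, None
    for f, g in pieces:
        val = f + sum(gi * yi for gi, yi in zip(g, y))
        if best_val is None or val < best_val:
            best_val, best_g = val, g
    return best_val, best_g


def project_onto_box(y, lower, upper):
    """Projection onto Y when Y is a box (assumption made for this sketch)."""
    return [min(max(yi, lo), hi) for yi, lo, hi in zip(y, lower, upper)]


def subgradient_ascent(pieces, y0, lower, upper, steps, step_size):
    """Run `steps` iterations of (7.3) and return the best point and value seen."""
    y = list(y0)
    best_val, _ = v_and_subgradient(y, pieces)
    best_y = list(y)
    for t in range(1, steps + 1):
        _, g = v_and_subgradient(y, pieces)
        s = step_size(t)
        y = project_onto_box([yi + s * gi for yi, gi in zip(y, g)], lower, upper)
        val, _ = v_and_subgradient(y, pieces)
        if val > best_val:               # values are not monotonic, so keep the best
            best_val, best_y = val, list(y)
    return best_y, best_val


# Hypothetical test function: v(y) = min(y1, y2, 6 - y1 - y2), maximized at (2, 2).
pieces = [(0.0, [1.0, 0.0]), (0.0, [0.0, 1.0]), (6.0, [-1.0, -1.0])]
best_y, best_val = subgradient_ascent(
    pieces, [0.0, 0.0], [0.0, 0.0], [10.0, 10.0], 500, lambda t: 1.0 / t
)
```

The subgradients visited along the way are exactly the linear pieces that a cutting plane method would use as supports of v.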

Evidently subgradient optimization and the BOXSTEP method would complement
each other.  Subgradient optimization can get close to the global optimum quickly,
and BOXSTEP works best when the global optimum is somewhere in the first
not-too-large box.  Any point $y^{t}$ produced by (7.3) can be used as the starting
point for BOXSTEP.  Furthermore, the subgradients that have been generated
provide an initial set of linear supports for BOXSTEP.

This idea for a hybrid algorithm was implemented for the p-median problem
(see Marsten [18] or ReVelle and Swain [21]).  The continuous version of this
problem has the form

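(7.4)   $$\min \; \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} x_{ij}$$

subject to

(7.5)   $$\sum_{j=1}^{n} x_{jj} = p$$

(7.6)   $$x_{ij} \le x_{jj} \qquad \text{for all } i \ne j$$

(7.7)   $$0 \le x_{ij} \le 1 \qquad \text{for all } i, j$$

(7.8)   $$\sum_{j=1}^{n} x_{ij} = 1 \qquad \text{for } i = 1, \ldots, n$$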

Dantzig-Wolfe decomposition, as developed in section 4, was applied to this
problem.  Dualizing with respect to the constraints (7.8) results in a Lagrangean
subproblem for which a very efficient solution recovery technique has been
devised by Blankenship [3].  The starting point for BOXSTEP was obtained by
making 250 steps of the subgradient optimization process (7.3).  The sequence
$\{s_t\}$ was taken as 5 repetitions of 40/t for each t = 1,...,50.  The results
are given in Table 7.  The column headed p gives the median sought, as specified
in (7.5).  The columns headed $L_{125}$ and $L_{250}$ give the maximum value of the
Lagrangean found during the first 125 and 250 steps, respectively.  $L_{max}$ is the
true maximum value of the Lagrangean.  $T_1$ and $T_2$ give the time devoted to
subgradient optimization and BOXSTEP, respectively, in CDC6400 seconds.  The last
column gives the number of boxes used during the BOXSTEP phase.  This test
problem had n=33 and the box size $\beta = 1$ was used in each case.  Up to 50 cuts
were retained and no line search was used.  For p=2 and p=4 an optimal solution
and zero subgradient were obtained very quickly, making the subsequent steps
and BOXSTEP solution superfluous.  In the remaining cases BOXSTEP, by using
more than strictly local information, was able to find and verify the optimal
solution after the subgradient optimization process had slowed down.  These
results are very encouraging and further experimentation is underway with
other types of problems.
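
One reading of the step-size rule used in these runs can be written out as below. This is an interpretation rather than a confirmed detail: the constant is assumed to be 40, and each value 40/t is assumed to be used for five consecutive steps, which accounts for the 250 steps. The function name and its parameters are hypothetical.

```python
def step_sizes(scale=40.0, distinct=50, repeats=5):
    """Return the step sizes scale/t for t = 1..distinct, each used `repeats` times."""
    return [scale / t for t in range(1, distinct + 1) for _ in range(repeats)]


schedule = step_sizes()  # 250 step sizes: 40 (five times), 20 (five times), ...
```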
Table 7.  Results for p-median problem with hybrid algorithm

p     $L_{125}$    $L_{250}$

2     17474.0      17474.0
3     14538.7      14622.5
4     12363.0      12363.0
8     7449.1       7454.0
9     6840.9       6843.6
10    6263.8       6265.4
11    5786.8
8.   Conclusion

The  BOXSTEP  method  replaces  a  given  convex  optimization  problem  with 
a  finite  sequence  of  presumably  easier  local  problems.   If  each  local 
problem  is  solved  by  outer  approximation,  then  BOXSTEP  can  be  viewed  as 
creating  an  algorithmic  continuum  between  feasible  directions  methods  and 
cutting  plane  methods.   Like  the  former,  BOXSTEP  can  take  advantage  of  a 
good starting point.  Like the latter, BOXSTEP can capitalize on accumulated
knowledge  of  the  function  being  optimized. 
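
The outer loop of the method can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the box has a fixed half-width `beta`, Y is the whole line, and the local problem is solved exactly by a hypothetical one-dimensional routine that inspects the box endpoints and the breakpoints of a piecewise linear concave function. All names and the test function are invented for the example.

```python
def boxstep(v, solve_local, y0, beta, eps=1e-9, max_boxes=1000):
    """Maximize a concave v by solving a sequence of box-restricted problems.

    solve_local(lower, upper) must return a maximizer of v over the box.
    Returns (y, v(y), number of boxes used).
    """
    center = list(y0)
    for boxes in range(1, max_boxes + 1):
        lower = [c - beta for c in center]
        upper = [c + beta for c in center]
        y_star = solve_local(lower, upper)
        interior = all(lo < yi < hi for yi, lo, hi in zip(y_star, lower, upper))
        # An interior maximizer, or no improvement over the center, implies
        # global optimality by concavity.
        if interior or v(y_star) <= v(center) + eps:
            return y_star, v(y_star), boxes
        center = list(y_star)            # translate the box and try again
    raise RuntimeError("box limit reached before optimality was verified")


# Hypothetical test function: v(y) = min(y, 3 + 0.5 y, 15 - y), maximized at y = 8.
PIECES = [(0.0, 1.0), (3.0, 0.5), (15.0, -1.0)]  # (f_k, g_k)


def v(y):
    return min(f + g * y[0] for f, g in PIECES)


def solve_local(lower, upper):
    """Exact 1-D local solver: check the endpoints and every breakpoint in the box."""
    lo, hi = lower[0], upper[0]
    candidates = [lo, hi]
    for i, (f1, g1) in enumerate(PIECES):
        for f2, g2 in PIECES[i + 1:]:
            if g1 != g2:
                point = (f2 - f1) / (g1 - g2)
                if lo <= point <= hi:
                    candidates.append(point)
    return [max(candidates, key=lambda p: v([p]))]


y_opt, v_opt, boxes_used = boxstep(v, solve_local, [0.0], beta=2.5)
one_big_box = boxstep(v, solve_local, [0.0], beta=1000.0)
```

With `beta=2.5` the sketch moves through several boxes before the maximizer falls in the interior, while a very large `beta` reduces it to a single global solve.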

Our computational work to date has focused on problems obtained by
applying Dantzig-Wolfe or Benders decomposition to large specially structured
linear programs.  The traditional approach to these problems amounts to
using one infinitely large box centered at the origin.  Our results show
that performance can be improved, often dramatically, by using a sequence
of "moderate" sized boxes.  The meaning of "moderate" is highly dependent
on problem structure and scaling but is not difficult to determine for a
given problem.  The main avenues for future work appear to be the following.

Structured linear programs:  Many other problems for which decomposition
has failed in the past need to be reinvestigated.  This is especially true
when BOXSTEP is considered in conjunction with subgradient optimization
as in section 7.

Structured non-linear programs:  BOXSTEP has yet to be tried on any of the
non-linear generalizations of Dantzig-Wolfe or Benders decomposition.  See,
for example, [8,9].

General non-linear programs:  In the case where v(y) is an explicitly
available concave function, BOXSTEP could perhaps be used to accelerate the
convergence of cutting plane algorithms.  Some of the BOXSTEP ideas might be
used to enhance the performance of MAP [4, 11, 20] procedures.

Integer programming:  Geoffrion [10] and Fisher and Shapiro [6] have recently
shown  how  the  maximization  of  Lagrangian  functions  can  provide  strong  bounds 
in  a  branch-and-bound  framework.   The  BOXSTEP  method  may  find  fruitful 
applications  in  this  context.   It  has  the  desirable  property  that  the  maximum 
values  for  successive  boxes  form  a  monotonically  increasing  sequence. 

We  have  also  made  only  the  most  rudimentary  implementation  of  the  method. 
In  all  of  the  experiments  reported  here  we  have  used  a  box  of  fixed  size  and 
have insisted upon optimizing each local problem (at least to within $\epsilon$).
Experiments currently underway indicate that better results can be obtained
by changing the size and shape of the box at each step and by suboptimizing
most  of  the  local  problems.   This  will  be  reported  in  a  subsequent  paper. 